Skip to content

[docs] FOLLOWUPS: record CI wall-time investigation post-PR #72#77

Merged
trilamsr merged 1 commit into
mainfrom
chore/followups-record-ci-investigation
May 19, 2026
Merged

[docs] FOLLOWUPS: record CI wall-time investigation post-PR #72#77
trilamsr merged 1 commit into
mainfrom
chore/followups-record-ci-investigation

Conversation

@trilamsr

Copy link
Copy Markdown
Contributor

Summary

After PR #72's verify split (wall 427s → 242s actual, vs 155s projected), investigated whether setup-go cache sharing could close the gap. It can't: all three verify-* jobs already share the same cache key and hit it. The gap is verify-test bounded at 235s by make coverage-check (race-on go test ./... across 50+ packages). Records two follow-ups in docs/FOLLOWUPS.md so future work doesn't re-chase setup-go:

  1. coverage-check sharding across multiple jobs — real wall-time win (~150s target) but a real refactor (split test execution + coverage aggregation). Triggered by PR throughput becoming binding.
  2. Move build-tags and build from verify-static into verify-lint — saves ~17% runner-minutes via dedupe of compile work, zero wall-time. Cheap, but no speed benefit; triggered by runner-minute pressure.

Test plan

  • make doc-check clean (banned-phrase, link-resolution, (unverified) baseline).
  • FOLLOWUPS-only PR; no code or workflow changes.
No user-facing change. Records a CI wall-time investigation under docs/FOLLOWUPS.md so future agents don't re-chase setup-go cache sharing as a speedup lever.

Two entries under "Open — opportunistic / Tooling":

1. `coverage-check` sharding to close the projection gap. Investigation
   on 2026-05-18 confirmed all three verify-* jobs already share the
   same setup-go cache key and hit it. The gap is verify-test bounded
   at 235s by race-on test execution, not cache misses. Closing it
   requires sharding `go test ./...` across jobs — a real refactor,
   triggered by PR throughput becoming binding.

2. Compile-work dedupe between verify-lint and verify-static. `make
   build-tags` and `make build` repeat work verify-lint already did.
   Moving both saves ~17% runner-minutes, zero wall-time. Triggered by
   runner-minute pressure or unrelated verify-split work.

Records both so future investigations don't re-chase setup-go cache
sharing as a wall-time lever.

Signed-off-by: Tri Lam <trilamsr@gmail.com>
@trilamsr trilamsr enabled auto-merge (squash) May 19, 2026 06:24
@trilamsr trilamsr merged commit 03f7d9d into main May 19, 2026
8 checks passed
@trilamsr trilamsr deleted the chore/followups-record-ci-investigation branch May 19, 2026 06:27
trilamsr added a commit that referenced this pull request May 19, 2026
Doc-only PR (335 changed lines, all markdown). Branch is `worktree-*`
so the rubric anchor auto-resolves to `(no-milestone)` per the skill;
binding rubric is PRINCIPLES.md / NORTHSTARS.md / STYLE.md /
docs/STYLE-docs.md / MEMORY.md. The MILESTONES.md M13 amendments are
the diff content, not the rubric anchor for this review.

Pre-Phase-1 fix: prior commit on this branch was amended (8e488bc) to
remove the Co-Authored-By trailer and add Signed-off-by, so the loop
runs on a DCO-compliant branch. PR body cleanup deferred to Phase 5
per skill orchestration.

This commit defers one new follow-up to docs/FOLLOWUPS.md (P1.6,
non-CPython interpreter support boundary). No code changes; doc-only
diff means no TDD code cycle. Test-discipline maps to doc-check
banned-phrase lint, link integrity, and the grep-based evidence map
in the PR body — all clean.

Findings:

P1.1 [BLOCKER] DCO + AI-vocab pre-conditions on prior commit and PR body
  Beneficiary: repo-long-term (feedback_no_ai_mentions_in_repo)
  Location: PR-level
  Description: Prior commit dfe1f6b carried Co-Authored-By for AI and
    lacked Signed-off-by. PR body has "Generated with Claude Code"
    footer and references the review-loop skill by name.
  Proof: `git log -1 --format='%(trailers)' dfe1f6b` showed trailer
    pre-fix; `git log -1 --format='%(trailers)' 8e488bc` shows only
    Signed-off-by post-fix.
  Contradiction: ran. Same rules apply across the repo per
    feedback_no_ai_mentions_in_repo; no exception for draft PRs.
  TDD record: amend verified via the trailers grep above; force-push
    landed (8e488bc replaces dfe1f6b on origin/worktree-*). PR body
    sync deferred to Phase 5.
  Action: applied 8e488bc (commit), deferred Phase-5 (PR body)
  Rationale: standard DCO repair flow on a draft PR with no reviewers.

P1.2 [NIT] External-link rot defense absent
  Beneficiary: repo-long-term
  Location: docs/rfcs/0009-pyspy-receiver-scope.md §References plus
    M13 citation corrections at MILESTONES.md:444-445
  Description: doc-check verifies on-disk links only; external GitHub
    URLs would rot silently if upstream issues move.
  Proof: doc-check output "252 markdown link(s) resolve to on-disk
    files" — no counter for external URLs.
  Contradiction: link-checker infrastructure is heavier than warranted
    for v0.1; industry standard accepts external link rot as best-effort.
  Action: explicitly-skipped
  Rationale: cost > value; manual review during reads suffices.
    PRINCIPLES §6 met (every cited claim has a URL).

P1.3 [NIT] Internal section-anchor brittleness
  Beneficiary: repo-long-term
  Location: MILESTONES.md M13 references "RFC-0009 §Design overview",
    "§Wire protocol", "§Safety properties"
  Description: doc-check verifies file existence, not anchor existence.
    Future RFC heading rename silently breaks back-references.
  Proof: `grep -c "RFC-0009 §" MILESTONES.md` returns 3; no automated
    test asserts those anchors resolve.
  Contradiction: at land time anchors resolve; future rename touches
    the author who is expected to grep for back-references.
  Action: deferred (same family as existing FOLLOWUPS rows on docs-
    parity helpers; no new row added)
  Rationale: low-frequency failure mode, manual mitigation sufficient.

P1.4 [NIT] Concurrent FOLLOWUPS.md edit conflict
  Beneficiary: repo-long-term
  Location: docs/FOLLOWUPS.md new section at file tail
  Description: My M13 section sits at end-of-file; concurrent PRs
    appending elsewhere merge cleanly; same-tail concurrent appends
    conflict.
  Proof: PR #77 recently edited FOLLOWUPS.md; my branch was rebased
    onto current main.
  Contradiction: standard git merge handles this; conflict is expected
    workflow, not a defect.
  Action: explicitly-skipped
  Rationale: not a defect.

P1.5 [NIT] fnv128a collision math n=10^7 not workload-cited
  Beneficiary: repo-long-term
  Location: docs/rfcs/0009-pyspy-receiver-scope.md §Design overview
  Description: The (2.7e-6, 1.5e-25) numbers cite the birthday-bound
    formula `n^2 / (2 * 2^k)` but the `n=10^7 distinct (rank, stack)
    pairs per fleet-day` upper bound is asserted without measurement.
  Proof: Python repl: `1e7**2/(2*2**64) = 2.71e-6`;
    `1e7**2/(2*2**128) = 1.47e-25`. Math is correct; n is an estimate.
  Contradiction: reader can re-derive for any n trivially from the
    inline formula. RFC text is falsifiable in shape.
  Action: explicitly-skipped
  Rationale: PRINCIPLES §6 met — formula provided; n is a reasonable
    upper-bound estimate, tightening is polish not correctness.

P1.6 [CONCERN] Non-CPython interpreter support boundary undocumented
  Beneficiary: customer-adopter
  Location: docs/rfcs/0009-pyspy-receiver-scope.md §Degraded modes
  Description: `faulthandler_missing` posture covers CPython-built-
    without-faulthandler but the RFC does not enumerate PyPy /
    MicroPython / GraalPy. An adopter on a non-CPython runtime would
    hit a confusing pip-install failure rather than a clean error.
  Proof: PyPy older versions: `import faulthandler` raises ImportError;
    GraalPy support is divergent across versions.
  Contradiction: ML training is overwhelmingly CPython; non-CPython in
    training is rare; Phase 4 operator docs are the right venue.
  TDD record: deferred — defended by operator docs (Phase 4).
  Action: deferred docs/FOLLOWUPS.md M13 section (this commit)
  Rationale: real constraint, low-frequency, doc-mitigation correct
    venue.

Validation-cycle stats:
- Findings rejected during contradict: 0
- Findings whose hard-proof did not reproduce: 0

TDD discipline stats:
- New code changes landed via failing-test-first: 0 (doc-only PR)
- Tests with mutation-verify outcome recorded: 0
- Tests reproducibility-verified: 0
- doc-check runs in draft cycle: 6 (all clean)

Edge cases enumerated:
- Internal link rot     → defended by doc-check (252 links resolve)
- External link rot     → P1.2 explicitly-skipped
- Anchor brittleness    → P1.3 deferred
- Concurrent edits      → P1.4 explicitly-skipped (standard merge)
- Collision math        → P1.5 explicitly-skipped (formula inline)
- Non-CPython runtime   → P1.6 deferred to FOLLOWUPS
- Rubric drift          → covered by P1.4

Rubric additions proposed in Phase 1: none. Initial rubric (repo-
standards) covers the doc-only diff.

Signed-off-by: Tri Lam <trilamsr@gmail.com>
trilamsr added a commit that referenced this pull request May 19, 2026
## Summary

Three observations from the recent session that didn't fit the
structured surfaces (already used for the load-bearing AGENTS.md entries
in PR #81 and the agent-internal notes in PR #82). Each captured via the
`learn-from-mistakes` flow and lands in its existing topic note.

**`.claude/notes/automation.md`** — *Memory captures rationale; hooks
enforce.* The pre-PR checklist personal memory landed mid-session was
followed by a lint failure shipped to CI within the hour. The same gap
closed reliably by the `PreToolUse` hook installed shortly after. For
any "always do X before Y" pattern, prefer the hook; the memory
documents *why* the hook exists.

**`docs/notes/ci.md`** — *Frame CI / perf projections as ranges, not
single numbers.* PR #72's 155s wall-time projection vs 242s actual cost
an investigation round (later landed in PR #77) because the projection's
setup-go-cache amortization assumption was unverified. Either verify
assumptions empirically before publishing the number, or frame as a
range.

**`.claude/notes/review-patterns.md`** — *Self-rate work, then write
criteria for the next grade up.* Forces articulation of measurable
improvements rather than free-form "anything else?". PR #76's B+ → A →
A+ came from two iterations of this exact pattern; each iteration closed
real structural gaps.

A fourth lesson — "fix existing tools before proposing new ones" — was
captured to personal memory (no PR, lives in
`~/.claude/projects/.../memory/`), not the repo, because it's a judgment
heuristic about my own decision-making rather than a repo-resident
convention.

## Test plan

- [x] `make doc-check` clean (banned-phrase lint, link resolution, all
gates).
- [x] `learn-from-mistakes` format check: banned vocabulary absent, no
first-person AI phrasing, no AI attribution, all three entries carry
`Anchor:` lines pointing at concrete PRs.
- [ ] CI on this PR exercises `doc-check` + `pr-lint`.

## Rollback

Each entry is a self-contained `### title` + body + `Anchor:` block at
the top of its file. No dependents elsewhere; reverting is a single Edit
per file.

```release-notes
NONE — documentation only. Three meta-lessons from a recent session retrospective land in their existing topic notes (`automation.md`, `ci.md`, `review-patterns.md`). No runtime behavior change.
```

Signed-off-by: Tri Lam <trilamsr@gmail.com>
trilamsr added a commit that referenced this pull request May 19, 2026
Doc-only PR (335 changed lines, all markdown). Branch is `worktree-*`
so the rubric anchor auto-resolves to `(no-milestone)` per the skill;
binding rubric is PRINCIPLES.md / NORTHSTARS.md / STYLE.md /
docs/STYLE-docs.md / MEMORY.md. The MILESTONES.md M13 amendments are
the diff content, not the rubric anchor for this review.

Pre-Phase-1 fix: prior commit on this branch was amended (8e488bc) to
remove the Co-Authored-By trailer and add Signed-off-by, so the loop
runs on a DCO-compliant branch. PR body cleanup deferred to Phase 5
per skill orchestration.

This commit defers one new follow-up to docs/FOLLOWUPS.md (P1.6,
non-CPython interpreter support boundary). No code changes; doc-only
diff means no TDD code cycle. Test-discipline maps to doc-check
banned-phrase lint, link integrity, and the grep-based evidence map
in the PR body — all clean.

Findings:

P1.1 [BLOCKER] DCO + AI-vocab pre-conditions on prior commit and PR body
  Beneficiary: repo-long-term (feedback_no_ai_mentions_in_repo)
  Location: PR-level
  Description: Prior commit dfe1f6b carried Co-Authored-By for AI and
    lacked Signed-off-by. PR body has "Generated with Claude Code"
    footer and references the review-loop skill by name.
  Proof: `git log -1 --format='%(trailers)' dfe1f6b` showed trailer
    pre-fix; `git log -1 --format='%(trailers)' 8e488bc` shows only
    Signed-off-by post-fix.
  Contradiction: ran. Same rules apply across the repo per
    feedback_no_ai_mentions_in_repo; no exception for draft PRs.
  TDD record: amend verified via the trailers grep above; force-push
    landed (8e488bc replaces dfe1f6b on origin/worktree-*). PR body
    sync deferred to Phase 5.
  Action: applied 8e488bc (commit), deferred Phase-5 (PR body)
  Rationale: standard DCO repair flow on a draft PR with no reviewers.

P1.2 [NIT] External-link rot defense absent
  Beneficiary: repo-long-term
  Location: docs/rfcs/0009-pyspy-receiver-scope.md §References plus
    M13 citation corrections at MILESTONES.md:444-445
  Description: doc-check verifies on-disk links only; external GitHub
    URLs would rot silently if upstream issues move.
  Proof: doc-check output "252 markdown link(s) resolve to on-disk
    files" — no counter for external URLs.
  Contradiction: link-checker infrastructure is heavier than warranted
    for v0.1; industry standard accepts external link rot as best-effort.
  Action: explicitly-skipped
  Rationale: cost > value; manual review during reads suffices.
    PRINCIPLES §6 met (every cited claim has a URL).

P1.3 [NIT] Internal section-anchor brittleness
  Beneficiary: repo-long-term
  Location: MILESTONES.md M13 references "RFC-0009 §Design overview",
    "§Wire protocol", "§Safety properties"
  Description: doc-check verifies file existence, not anchor existence.
    Future RFC heading rename silently breaks back-references.
  Proof: `grep -c "RFC-0009 §" MILESTONES.md` returns 3; no automated
    test asserts those anchors resolve.
  Contradiction: at land time anchors resolve; future rename touches
    the author who is expected to grep for back-references.
  Action: deferred (same family as existing FOLLOWUPS rows on docs-
    parity helpers; no new row added)
  Rationale: low-frequency failure mode, manual mitigation sufficient.

P1.4 [NIT] Concurrent FOLLOWUPS.md edit conflict
  Beneficiary: repo-long-term
  Location: docs/FOLLOWUPS.md new section at file tail
  Description: My M13 section sits at end-of-file; concurrent PRs
    appending elsewhere merge cleanly; same-tail concurrent appends
    conflict.
  Proof: PR #77 recently edited FOLLOWUPS.md; my branch was rebased
    onto current main.
  Contradiction: standard git merge handles this; conflict is expected
    workflow, not a defect.
  Action: explicitly-skipped
  Rationale: not a defect.

P1.5 [NIT] fnv128a collision math n=10^7 not workload-cited
  Beneficiary: repo-long-term
  Location: docs/rfcs/0009-pyspy-receiver-scope.md §Design overview
  Description: The (2.7e-6, 1.5e-25) numbers cite the birthday-bound
    formula `n^2 / (2 * 2^k)` but the `n=10^7 distinct (rank, stack)
    pairs per fleet-day` upper bound is asserted without measurement.
  Proof: Python repl: `1e7**2/(2*2**64) = 2.71e-6`;
    `1e7**2/(2*2**128) = 1.47e-25`. Math is correct; n is an estimate.
  Contradiction: reader can re-derive for any n trivially from the
    inline formula. RFC text is falsifiable in shape.
  Action: explicitly-skipped
  Rationale: PRINCIPLES §6 met — formula provided; n is a reasonable
    upper-bound estimate, tightening is polish not correctness.

P1.6 [CONCERN] Non-CPython interpreter support boundary undocumented
  Beneficiary: customer-adopter
  Location: docs/rfcs/0009-pyspy-receiver-scope.md §Degraded modes
  Description: `faulthandler_missing` posture covers CPython-built-
    without-faulthandler but the RFC does not enumerate PyPy /
    MicroPython / GraalPy. An adopter on a non-CPython runtime would
    hit a confusing pip-install failure rather than a clean error.
  Proof: PyPy older versions: `import faulthandler` raises ImportError;
    GraalPy support is divergent across versions.
  Contradiction: ML training is overwhelmingly CPython; non-CPython in
    training is rare; Phase 4 operator docs are the right venue.
  TDD record: deferred — defended by operator docs (Phase 4).
  Action: deferred docs/FOLLOWUPS.md M13 section (this commit)
  Rationale: real constraint, low-frequency, doc-mitigation correct
    venue.

Validation-cycle stats:
- Findings rejected during contradict: 0
- Findings whose hard-proof did not reproduce: 0

TDD discipline stats:
- New code changes landed via failing-test-first: 0 (doc-only PR)
- Tests with mutation-verify outcome recorded: 0
- Tests reproducibility-verified: 0
- doc-check runs in draft cycle: 6 (all clean)

Edge cases enumerated:
- Internal link rot     → defended by doc-check (252 links resolve)
- External link rot     → P1.2 explicitly-skipped
- Anchor brittleness    → P1.3 deferred
- Concurrent edits      → P1.4 explicitly-skipped (standard merge)
- Collision math        → P1.5 explicitly-skipped (formula inline)
- Non-CPython runtime   → P1.6 deferred to FOLLOWUPS
- Rubric drift          → covered by P1.4

Rubric additions proposed in Phase 1: none. Initial rubric (repo-
standards) covers the doc-only diff.

Signed-off-by: Tri Lam <trilamsr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant